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Abstract 

o 

Perfect sampling is a technique that uses couphng arguments to 
^ provide a sample from the stationary distribution of a Markov chain 

in a finite time without ever computing the distribution. This tech- 
' ' nique is very efficient if all the events in the system have monotonicity 

0^ property. However, in the general (non-monotone) case, this technique 

' needs to consider the whole state space, which limits its application 

I— I only to chains with a state space of small cardinality. We propose 

^) here a new approach for the general case that only needs to consider 

two trajectories. Instead of the original chain, we use two bounding 
processes (envelopes) and we show that, whenever they couple, one ob- 
^ tains a sample under the stationary distribution of the original chain. 

1 I We show that this new approach is particularly effective when the state 

space can be partitioned into pieces where envelopes can be easily com- 
puted. We further show that most Markovian queueing networks have 
this property and we propose efficient algorithms for some of them. 

^\ Keywords: Markov chains, perfect sampling, queueing systems. 
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2 1 Introduction 

.!_, Simulation approaches can be efficient ways to estimate the stationary be- 

^ havior of Markov chains by providing independent samples distributed ac- 

cording to their stationary distributions, even when it is impossible to com- 
pute this distribution numerically. 

Propp and Wilson used a backward coupling scheme [I4j to derive a 
simulation algorithm - called PSA (Perfect Sampling Algorithm) in the fol- 
lowing - providing perfect samples (i.e. whose distribution is stationary) of 
the state of a discrete time Markov chain over a finite state space S. The 
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main idea is to simulate trajectories starting from all states x £ S at some 
time in the past until time t = 0. If the state at time is the same for all 
trajectories, then the chain has coalesced and the final state has the station- 
ary distribution of the Markov chain. Otherwise, the simulations are started 
further in the past. Perfect sampling procedures have been developed in var- 
ious contexts (for more information, see the annotated bibliography [16]). 

Although the Perfect Sampling Algorithm provides perfect samples for 
all irreducible and aperiodic finite Markov chains, its complexity can be 
high. It suffers from two drawbacks that jeopardize its applicability for very 
large chains. 

• The first one is the fact that the coupling time can be very large. Some 
recent work focused on the estimation of the coupling time for certain 
classes of Markov chains. For example, it was shown in [4j that Markov 
chains, modeling a class of networks of queues with finite capacities, 
have a quadratic coupling time with respect to capacities. 

• The second factor in the complexity of PSA is the fact that one needs 
to run one simulation per state in S, which is prohibitive for most 
applications. Various techniques have been developed to reduce the 
number of trajectories that need to be considered in the coupling from 
the past procedure. A first crucial observation already appears in [H] : 
for a monotone Markov chain, one has to consider only extremal initial 
conditions. For anti-monotone systems, an analogous technique that 
also considers only extremal initial conditions has been developed by 
Kendall [11] and Haggstrom and Nelander [8]. 

To cope with non-monotonicity, Kendall and M0ller introduced in [12] 
the general idea of two bounding processes sandwiching all the normal 
trajectories, reducing the number of processes having to be computed 
to two. They applied the sandwiching technique to perfect sampling 
of point processes, together with the idea of a dominating chain that 
allowed to handle the infinite state space. The idea of sandwiching 
processes was also applied in communication networks [U [7] . Huber 
[9j also introduced a similar idea of bounding chains for determining 
when coupling has occurred. In his model, the bounding chains are 
evolving in some bigger state space than the target Markov chain. 
However, both Kendall-M0ller's and Ruber's constructions are model- 
dependent and do not have an algorithmic counterpart so that they 
are not straightforward to apply in general. 

In this paper we introduce a new perfect sampling algorithm (EPSA: 
Envelope Perfect Sampling Algorithm) that uses the lattice structure of the 
state space to design an effective and automatic construction of a bounding 
interval chain that can be expressed using only two trajectories. This is done 
by computing from the past, a lower and an upper bound of all trajectories 
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of the Markov chain. Whenever the two bounds meet, the couphng state 
is distributed according to the stationary distribution of the chain. The 
idea of bounding interval chain is close to sandwiching processes in Kendall 
and M0ller [T^. However, their approach was not constructive and the 
sandwiching processes were computed ad-hoc for each application. 

Our aim is to propose, under a constructive approach, several algorithms 
to compute bounding interval chains that are easily computable and as tight 
as possible. This is done in two cases. The first (easy) case is when events 
are almost state space homogeneous (defined in Section |4]). In that case, 
we provide an algorithm that computes the update rule of the bounding 
interval chain. The second case is more difficult since events are assumed to 
be piecewise defined (Sectionjs]) over arbitrary polytopes (Section[6]). In that 
case we show how to obtain an interval containing all trajectories by solving 
a set of linear programs. Since the worst case complexity of linear programs 
may be too high, we show how to avoid solving these, at the expense of 
obtaining bounding intervals that may be less tight. 

In the last part of the paper (Section [7|), we consider applications to 
queueing networks. We exhibit several events (such as negative customers, 
fork and join, batch routing) that satisfy almost state space homogeneity. 
We also provide numerical evidence of the very good performance of EPSA 
in these cases, by computing the coupling time experimentally. 

We also show examples (such as generalized join the shortest queue) 
that are piecewise homogeneous, where the smallest interval containing all 
trajectories can also be computed in linear time in the number of queues. 
Our numerical experiments show that EPSA behaves just as well as in the 
easy case of almost space homogeneous events. 

2 Definitions and Basic Properties 

If not stated otherwise, all real (or integer) vectors are row vectors and 
denotes the transposed vector. For x,y £ M'^, we denote x V y *== 
(max{xi, yi},---, 
max{xd, Vd}), and x A y =^ (min{xi, yi}, . . . , min{xrf, y^}). To avoid 
multiple brackets, for x,y, z € M'^, we will write: 

X V y A z {x V y) A z. 

Note that (x V y) A z ^ x V (y A z); the relations involving operators V and 
A will be evaluated from left to right. For any x G M", ||x||i = X]r=i l-*^*!' 
and for any set U, \U\ denotes its cardinality. 

Throughout the paper, we consider discrete time homogeneous Markov 
chains (DTMC) with a finite state space S. Since we only consider finite 
state spaces, the continuous time models can be easily handled using the 
same techniques, after uniformization. 
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2.1 Markov Automaton 



The evolution of a finite DTMC can always be obtained using a finite number 
of discrete events (or actions). To formalize this, we introduce the notion 
of Markov automaton. Markov automatons are similar to Generalized Semi 
Markov Processes p] with a focus on state changes rather than on time. 

Definition 2.1. A Markov automaton is a tuple A = {S,A,D,-) where 
S is a finite state space, A is an alphabet called the set of events, D is a 
probability distribution on A, and ■ is a right action by the letters of A on 
the states in S and is called the transition function ; 

: S X A^ S,{x,a) ^ X • a. 

Equivalently, this action can be most naturally extended to a right action by 
words ui^n '= ui . . . Un in A" on S for any n G N; 

: S X A^ — )• 5, (x, Ui-^n) ^ X ■ Ui^n ^= X ■ Ui ■ U2 ■ ■ ■ . ■ Un- 

A Markov automaton A = (S, A, D, •) naturally induces a Markov chain 
with the following construction: 

Let (mi, . . . , n„, . . . ) be an infinite sequence of random letter of A i.i.d. dis- 
tributed according to D. We consider this sequence as an infinite word 
u =^ ui . . . n„ . . . on A. Then for any xq G 5, the random process (X„ =^ 
xq ■ ui^n)nm is a Markov chain issued from xq with probability transition 
matrix P given by 

foranx,yin5, P{x,y) = ^ Foia). (2.1) 

x-a=y 

We will say that the Markov chain {Xn) is generated by A and u. Moreover, 
with this construction, we can build a family of Markov chains {(X„(x) = 
x ■ ui^n)n>o \ X £ S} Starting from each state x £ S. We will refer to this 
family of Markov chain as the grand coupling generated by A and u. We will 
say that the grand coupling has coalesced at time n if all the Markov chains 
of the family has reached the same state, i.e. when S ■ ni_s.„ is reduced to a 
singleton. 

Conversely, for any probability transition matrix P on a finite space state 
S, it is easy to see that there exists a Markov automaton A = {S,A,D, •) 



such that (2.1) holds, i.e. such that A generates a Markov chain on S with 
same transition matrix P, but that automaton is in general not unique. 
However as we will see in our examples, such representation of Markovian 
systems with Markov automatons naturally arises for many systems such as 
queueing networks. 
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2.2 Perfect Sampling 

Let be an irreducible and aperiodic DTMC with a finite state space 

S and a transition matrix P. Let vr denote the steady state distribution of 
the chain: vr = ttP. 

The Perfect Samphng Algorithm (PSA) was first used by Propp and 
Wilson. It allows one to draw in finite time steady state distributed variable 
using coupling from the past. 

Consider a Markov automaton A = (5, A, D, •) and {uq, U-i, . . . , U-n, • • • ) 
a sequence of events i.i.d. with distribution D on A. 

Theorem 2.2 (Propp and Wilson [14]). There exists ^ € N such that 
lim \S ■ U-n+i^o\ = i almost surely. 

The grand coupling generated by A and u couples if i = 1. In that case, 
let T *== inf i^n : \S ■ U-n+i^o\ = 1 be the coupling time of the chain. 
Then E(t) < oo and S ■ u^r+i^o is steady state distributed. 



Algorithm 1: Perfect Sampling Algorithm (PSA) 
Data: Li.d. events {w-nl^g^ ^ 

Result: A state x* € S generated according to the stationary 
distribution of the Markov chain 

begin 

n := 0; 

foreach state x e S do S[x] := x; 

// S[x] is the state of the trajectory issued from x at 

time —n. 
repeat 

foreach state x £ S do R[x] := S[x ■ S := R; 

n := n + 1; 
until \{S[x],xe S}\ = 1 
(x* denotes this unique value); 
return x* 



The algorithmic counterpart (PSA) of Theorem 2.2 is given in Algo- 
rithm [T] The main drawback of PSA algorithm is the fact that one needs 
to simulate one Markov chain starting from each state in S, that could be 
too large for a practical use of the algorithm. 

Several approaches have been used to overcome this problem. The main 
one for a partially ordered state space (5, ^) and monotone events is already 
present in [Tj 
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Definition 2.3. An event a €z A is said to be monotone if x ^ y =^ x-a ^ 

y ■ a. An event a G A is said to be anti-monotone if x <y =^ x ■ a ^ y ■ a. 

If all events are monotone, then it is possible to generate a steady state 
by starting Algorithm [T] with maximal and minimal states only |14| . This 
technique has been successfully used in [15] to construct PSA for network 
of queues. Similarly, an efficient perfect sampling algorithm for the chains 
with anti- monotone events was given in [8]. 

When events are neither monotone nor anti-monotone, one can still use 
monotone bounds, as in [5J, but this technique provides a sample from a 
bounding chain and not the initial one. We give in this paper an algorithm 
for perfect sampling of general (non-monotone) chains. 

3 Envelopes 

The approach proposed here generalizes what has been done in [6j and [2] 
to sample non-monotone Markov chains. It uses a bounding chain method 
similar to the one employed by Kendall and M0ller in |12j . The sandwiching 
chains in [T^ are in general difficult to compute, except for some models for 
which effective ad- hoc rules can be derived. Our approach is more algorith- 
mic. We introduce a general framework that provides efficient algorithms 
for a wide variety of events, including many queueing events. 

3.1 Bounding Interval Chains 

We assume from now on that the state space S is equipped with a lattice 
order relation ^. 

For m,,M £ S, we will denote by [m, M] = {x G S : m < x < M} the 
lattice interval (or simply interval) between the endpoints m and M (note 
that [m, M] 7^ if and only if m ^ M). We introduce X as the set of all 
nonempty lattice intervals: X = {[m, M\ : m, M G S, m ^ M}. 

We define bounding interval chains as follows. 

Definition 3.1. Given a grand coupling {{Xn{x))n>o \ x G S}, we call 
bounding interval chain of that grand coupling any Markov chain of nonempty 
intervals ([m„,M„])„>o such that for all x in S and all n > 0, Xn{x) E 
[m„,Mn]. 

In particular we notice that when = M„, the grand coupling inside 
has necessary coalesced. Thus, by constructing efficiently such bounding 
intervals, we can detect easily if a grand coupling has coalesced and apply 
it to PSA. 

We now construct the tightest bounding interval chain computable step 
by step for a given grand coupling, that we will call envelope chain. 
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sup U 



Figure 1: Intervals in dimension two 
for the natural product order. The 
set U is the set of all the black points. 
The interval lU} is displayed as a 
rectangle. 




Figure 2: Envelopes of the action of a 
on an interval. The interval [m, M] is 
displayed as black points. The action 
of a is displayed as arrows. The en- 
velope interval [m' , M'] = [m, M] □ a 
is displayed as a rectangle. 



For any set U C 5, we denote by [{/J and we call envelope of U the 

smallest interval of S (for inclusion) containing U : lU} =^ [inf(L'^), sup(C/)]. 
This is illustrated by Figure [T} We introduce a new transition operator 
□ : I X A ^ I called envelope transition that transforms intervals into 
intervals and is defined by : for all [m, AI] G I and a £ A: 



m,M]Ba=^ l[m,M]-aj 



inf {x • a} , sup {x • a} 

m^x^M m-<x-<M 



(3.1) 



The construction of □ is illustrated on a simple example in Figure [2j 

As with we extend the operator to a finite word of events ui^n 
til . . . u„ G A"^: 



[m, M] □ ni_ 



dof 



[m, M] □ ui □ U2 □ . . . □ tin- 



Let us call _L =^ inf S (resp . 
of S. The process 



def 



sup 5) the bottom (resp. top) element 



[m„, Mn] =^ [-L, T] H tii^n 

is a Markov chain over the state space 5x5, called the envelope chain, and 
is a bounding interval chain of the grand coupling {(x • tii-s-n) | x G S}. 

It is also the tightest bounding chain computable step by step, in the 
sense that if, at time n, we only know that S ■ ui^n is in [m„,M„], then 
the tightest interval containing S ■ ui_^„+i we can compute at time n + 1 is 
[mn, Mn] □ Un+1- However, it is not tight in the stronger sense of [m„, M„] 
being equal to IS • ui^n} , that does not hold in general. 

Notice that the lower envelope {mn)ne'N alone is not a Markov chain, 
neither is the upper one {Mn)neN, since they depend on each other. 

We can then use the envelope process to detect coupling in PSA. 
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Theorem 3.2. Assume that [mn,Mn] hits the set of single point intervals 
X = {[x,x] : X £ S} in finite time: 

Te =^ mm|n : [_L, T] H u_„+i^o G '^j, 

then Te is a backward coupling time of the envelope chain. The state defined 
by [-L, T] H ti-re+i->o has the steady state distribution of DTMC {Xn}n&i- 

Proof. The proof simply uses the fact that the set S ■ n__„+i_>o is included 
in [-L, T] □ u-n+i^o- Consequently, if the latter is reduced to one point, so 
is the former and the grand coupling has coalesced. □ 

Algorithm [T] can be adapted to the envelope simulation: start the simula- 
tion with only two states, _L and T, and iterate using □ and an i.i.d. sequence 
of events. This new algorithm, called EPSA (Envelope Perfect Sampling Al- 
gorithm) , is given in Algorithm [2j The reason why we double n at each loop 
of the algorithm is that we need to compute S\I]U-n+i^Q in each loop, which 
corresponds to n iterations of □ . While increasing n by 1 at each loop would 
lead to a quadratic cost in n, doubling it keeps the complexity linear. 



Algorithm 2: Envelope Perfect Sampling Algorithm (EPSA) 

Data: I.i.d. events {M-n}„gN ^ -^^i (pre-computed) envelope 
operation □ 

Result: A state x* G 5 generated according to the stationary 
distribution of the Markov chain 

begin 

n = 1; m := _L; M := T; 
repeat 

for i = n — 1 downto do 

[_ [m,M] := [m,M] H n_i ; 

n := 2n; 
until m = M; 
X* := m; 
return x* ; 



EPSA is a generalization of both monotone and anti-monotone sampling 
algorithm (see [IT, 'S]): 

• Monotone events: if an event a G ^ is monotone, then for any m ^ M, 

[m, M]Ba=[m-a, M -a]. (3.2) 

• Anti-monotone events: if an event a G ^ is anti-monotone, then for 
any m ^ M, [m, M] □ a = [M - a, m ■ a]. 
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In the next section we discuss the complexity of EPSA, that depends 
on the couphng time of the envelope chain, but also on the complexity of 
the computation of the envelope transition. The efficiency of the (anti-) 
monotone case is due to the fact that when ' • ' is (anti-)monotone, the 
complexity of □ is the same as using ' • ' twice. 

Note however that the construction of the envelopes depends on the 
representation of the Markov chain by a Markov automaton that is not 
unique. Different automatons representing the same chain would lead to 
different envelope chains with different coupling properties (one may coalesce 
almost surely and the other not, or if they both coalesce their coupling times 
may be different). The complexity of the envelope transition may also differ 
depending on the representation chosen. 



3.2 Complexity of EPSA 

The envelope approach may not gain over the general PSA because of two 
drawbacks: 



{Di) The assumption that [mn,Mn] hits the set of single point intervals X 



in Theorem 3.2 may not hold. 



{D2) Even if the assumptions for Theorem 3.2 are satisfied, the complexity 
of the algorithm may be prohibitive. The average complexity of EPSA 
is 0{Ce X E(re)) where Te is the number of iterations of the main loop 
of EPSA (called the coupling time in the following) and Ce is the 
complexity (number of elementary operations) of computing [m, M] □ 
a. Meanwhile, the average complexity of the classical PSA is 0(C x 
|5| X E(t)), where r is the coupling time of PSA, |5| the cardinality 
of the state space and C the complexity of the computation of x • a. 



Remark 3.3. It is important to notice that the operation □ can be replaced 
by any over- approximation such that for all interval [m, M], [m, M] □ a C 
[m, M] a, without altering the correctness of the algorithm. By doing so, 
we would still obtain a bounding interval chain, however its coupling time 
might be longer than for the envelope chain, or the bounding interval chain 
might not couple at all. There is then an obvious tradeoff between computing 
simple large bounds quickly with an increased coupling time, or computing 
tight bounds slowly with a potentially shorter coupling time. 

Remark 3.4. Actually, even if the envelope chain does not couple, it is 
still possible to obtain perfect samples in finite time by using the splitting 
algorithm f^. The splitting algorithm is hybrid: it first runs EPSA when 
the envelopes are too far apart and switches to the usual PSA algorithm as 
soon as the number of states inside the envelopes becomes manageable. 
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The rest of this paper focuses on improving the cost Cg of computing 
envelopes under several assumptions except for Section 4.2 and numerical 
examples in Section [7] that discuss the coupling time issue. 

Indeed, if Ce is not too large compared to C (as it is the case for the 
various transitions proposed in this article), then the comparison of the two 
methods is essentially a comparison between the coupling time of EPSA 
(E(re)) and the coupling time of PSA multiplied by the cardinality of the 
state space (|5| x E(r)). 



4 Almost Space Homogeneous Events (ASHEs) 

We now present a family of simple events called Almost Space Homogeneous 
Events (ASHEs) for which the corresponding envelope transitions are easy 
to compute. These events will be the first brick to build more elaborate ones. 
Intuitively, these events consist in moving homogeneously every point of the 
space in the same direction, with some simple conditions on the boundaries 
of the state space. 



4.1 Computing the envelopes for ASHEs 

We assume from now on that S is an interval of Z'^: S = [0, C], where 
C = (Ci, . . . , Cd) G endowed by the usual product partial order: x < 
y Xi < Ui, Vi. Therefore for any m,M G S, we have [m,M] = 

[nil, Ml] X • • • X [md,Md]. 

Definition 4.1. An event a is an almost space homogeneous event (ASHE) 
if there exists a vector w S Z"' and a binary relation IZ (called blocking 
relation j on the set of components {1, ... ,6?} such that for all x in S we 
have the following property: 

(P) Let CR{x) '= {i : Xi+Vi [0, Cj]} he the set of the critical components 

(i.e. where x + v is out of the state space), and B{x) *== {i : 3j S 
CR{x), {j,i) G TZ} the set of all blocked components in state x. Then 
for all i : 

(x.a)=h'' i(^Bix), 
\{xi + Vi)VOACi, i^B{x). 

In particular, when x + v G S, we have x ■ a = x + v. 
We call V the direction vector of event a. 

def 

Remark 4.2. The blocking relation TZ needs only to be specified on Act = 
{i \ Vi ^ 0} the set of active components of the transition, since the event 
has an effect only along these components. 
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The cases B{x) = or B{x) = d are interesting particular cases that 
correspond respectively to the case where, if x + w 5, then the event is 
truncated at the boundary of the state space: x ■ a = (x + V A C, or the 
event is blocked: x ■ a = x. 

Example 4.3. Here is a more complex case in the context of queues. Let us 
consider a Markovian queueing network made of 3 finite queues and expo- 
nential service times and a fork node: one departing packet from queue 1 is 
duplicated into two packets joining queues 2 and 3. This event is an ASHE, 
corresponds to adding a vector v = (— 1,+1,+1) to all states except on the 
boundaries. The critical cases are when the first queue is empty or when one 
of the next queues is full. If the first queue is empty, then the event is blocked 
so that (1, 2) and (1, 3) are in TZ. If either queue 2 or queue 3 are full, then 
the packet actually leaves queue 1 and does not join the other queues (both 
duplicates are lost). This means that both (2, 3), (3, 2) are in TZ. These are 
the only constraints, thusTZ = {(1, 2), (1, 3), (2, 3), (3, 2)}. Other cases could 
also be taken into account, for example, the case where a full queue does not 
prevent the arrival in the other queue, corresponding to TZ = {(1,2), (1,3)}, 
or when a full queue blocks the departure from the first queue, corresponding 
ton = {(1, 2), (1, 3), (2, 1), (2,3), (3, 1), (3, 2)}. 




Figure 3: Fork node 

The computation of envelopes for a fixed ASHE a is given in Algorithm [3j 

Theorem 4.4. Algorithm^ computes [m! ,M'] = [m, M] □ a in 0{d^) ele- 
mentary operations. 

The proof is given in Appendix [A} 
4.2 Nonexpansive properties of ASHEs 

In the previous Section, we have shown that computing the envelopes for 
ASHEs can be done efficiently, so that the term Cg in the complexity is in 
O(d^). This section gives some insights on the coupling time. Since our ap- 
proach is general, no tight bound on the coupling can be derived. The most 
immediate property is that, as mentioned in Section [3| the coupling time of 
envelopes is larger than the coupling time of the original chain. However, 
as seen in the experimental section [7j the difference between two coupling 
times is not large. This can be explained in part by several nonexpansive 
properties of ASHEs. In this section we show that the size of the envelope 
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Algorithm 3: Envelope computation for an ASHE a ^ A 

Data: States m and M (current infimum and supremum state); 
An ASHE a with direction vector v and blocking relation TZ; 
Result: [m', M'] = [m, M] □ a (lower and upper envelope), 
begin 

Act = {i : Viy^ 0}; 

CR{m) = {ie Act : rrii + Vi ^ [0, Cj]}; 

CR{M) = {ie Act : Mi + Vi^ [0, Q]}; 

X = {j £ Act : 3i £ CR{m) U CR{M) such that £ 7^}; 

Y = {j £ Act : 3i £ CR{m) n CR{M) such that £ 7^}; 

// Note that Y dX 

for j £ Act\X do 

// This is the case j B{m) U B{M) 

m'j = {rrij + Vj) V A Cj\ M'- = {Mj + Vj) V A Cj; 

for j £ Y do 

^ m'j =mj; M'- = Mj; 

for j £ X\Y do 

if 3i, i 7^ j such that eU andie CR{m) U CR{M) 
then 

// component j is blocked by a component i 

different from j 
if Vj < then 

[_ m'j = {mj +Vj)V 0; Mj = Mj; 

else 

[ lm'j = mj; M'- = {Mj + Vj) A Cj; 
else 

// only j is blocking j 
if f j < then 

L = 0; = (^^- + V (-^.- - 1); 

else 

for j ^ci do 

[_ m'j = mj; M'- = Mj; 
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interval cannot increase more than linearly in the worse case, and does not 
increase (or even contracts) under some conditions on the blocking relation. 
In the latter case, the contracting property also gives some insight of the 
splitting condition. 



Assume that a E ^ is an ASHE given by (4.1). We first give a useful 



lemma about the set of blocked components of an ASHE. 

Lemma 4.5. Let U be a subset of S and let [m, M] = fU}. We have 

[j B{x) = B{m)UB{M). 

Proof. We first prove the inclusion C : Let x G U and j £ B. Then 
there is i G CR{x) such that G TZ. Now i G CR{x) implies that 

i E CR{m)UCR{M) {asrui + Vi < Xi + Vi < Mi + Vi), thus j G B{m)UB{M). 
We now prove the other inclusion. Let j £ B{m), there is a i such that 
G TZ and rrii + Vi ^ [0,Cj]. Since [m,M] = [[[/]], there exist x £ U 
such that Xi = nii, and thus Xi + Vi ^ [0, Cj] and j £ B{x). This shows 
B{m) C U^g^ B{x), and the same reasoning shows B{M) C |Ja;e;7 B{x). □ 

The following proposition shows the nonexpansive property in the case 
without blocking. For m,M £ S such that m < M, denote by [m',M'] = 
[m, M] □ a. 

Proposition 4.6. If for m,M £ S such that m < M, B{m) U B{M) = %, 
then: 

\\M' - m'Wi <\\M - m\\i. 



Proof. If B{m) U B{M) = 0, then by Lemma 4.5, ^x(^[m,M]^'y.^) = ■^("^) 



B{M) = 0. Then ( [4T| ) becomes x ■ a = {x + v) V A C, Mm < x < M, 
that is a monotone function of x (i.e. event a is monotone). Therefore, 
\\M' - m'\\i = \\M ■ a-m- a\\i<\\M -m\\i. □ 

Corollary 4.7. IflZ = ^, then for any m,M £ S such that m < M: 

\\M' - m'Wi < \\M - m\\i. 

The next proposition gives an upper bound in the case where each com- 
ponent can only be blocked by itself. In that case, relation TZ is defined by 
the set B of all auto-blocking components: TZ = ■ i £ B}. 

Proposition 4.8. IfTZ = {{i,i) : i £ B}, for some subset B C {1, ... ,d}, 
then: 

||M'-m'||i < '^{Mi-mi) + '^max{Mi-mi,\vi\-l} 

= \\M -m\\i + ^[\vi\- Mi + mi-l]\/0. 
ieB 
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Proof. Since there are no interactions between components, we can consider 



them separately. Thus, for i ^ B: M- —m^< Mi — nii, as in Proposition 4.6 
For i € B, we have three different cases: 

• For ah X £ [m, M], Xi + Vi G [0, Cj]. This is equivalent to i CR{m) U 
CR{M). Then {x-a)i = Xi + Vi, Vx G [m,M] and M[-m'i = Mi-mi. 

• For all X G [m, M], Xi + Vi [0, Cj]. This is equivalent to i G CR{m) n 
CR{M). Thus (x • a)i = Xi, Mx G [m, M] and M- - m ■ = Mj - m^. 

• There are x,y £ [m, M] such that Xi + Vi G [0, Ci] and yi + Vi [0, Cj]. 
Suppose that > (the case < is symmetrical). Then i G 
CR{M)\CR{m). Denote by x', = Q - w,. Then < x^ < M,. We 
have: 

m', = {rrii + Vi) A (x', + 1), M[ = d. 
There are two cases: 

— rrii + Vi < x'i + \. Then M[ — m'i = Ci — {mi + Vi) = x'j — mi < 
Mi - mi - 1. 

- mi + Vi> x\ + 1. Then M[ - m'. = Q - (x^ + 1) = - 1. 

Thus M[ — m[ < max{Mj — mi, \vi\ — 1}. □ 

The above result tells us that an event without blocking interactions 
between different components (i.e. (i,j) G TZ =^ i = j) is nonexpansive 
in components that are not auto-blocking. For components that are auto- 
blocking such event is nonexpansive when Mi — mi > \vi\ — 1. 

Corollary 4.9. If TZ = {(i,i) : i G B}, for some subset B C {!,..., d}, 

and \vi\ = i £ B, then \\M' — m'\\i < \\M — m||i, m < M. 

We give now an upper valid bound for any ASHE. 



Proposition 4.10. For any ASHE 'a' given by {4-1), we have: 

||M' - m'lli < ||M - m||i llulli - 1. 

Proof. For i G {1, . . . ,d} such that Vi < 0: mi + Vi < (x ■ a)i < Mi, Vx G 
[m, M], and for i G {1, . . . ,d} such that Vi > 0: mi < {x-a)i < Mi + Vi, Vx G 
[m,M]. Thus \\M' - m'\\i < \\M - m\\i + \\v\\i. 

Suppose that for at least one component i £ A, we have two states x and 
y such that Xi + Vi £ [0, Ci] and yi + Vi ^ [0, Ci]. (Otherwise \ \M' — m'\\i < 
\\M — mill.) For this component: (M^ — m[ < Mi — m^ + \vi\ — 1. Indeed, 
suppose that > (the case I'j < is symmetrical) . Then Mi + Vi — l>Ci 
and mi < {z ■ a)i < Ci,\/z £ [m, M]. Thus M^ - m'^ < d - mi < Mi - 1 + 
Vi — mi, so we have: 



|M' - m'lli < ||M - m||i ||v||i - 1. 

□ 
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5 Partition of the state space 



5.1 Zone partition 

In this section we focus on a single event a. The state space is partitioned 
into zones, Z^, - ■ ■ , Z^ , where K is assumed to be small compared to the 
size of the state space. The action a may have a different effect on each 
zone Z^. Formally, the action ' • ' by a is defined for all x G 5 as : x-a = x-a^ 
where k is such that x ^ Z^. To compute the envelopes of the action of a 
over an arbitrary interval [m, M], one can use the partition into zones. 



[m,M] □ a = l[m,M] ■ aj 



C 




(5.1) 



(5.2) 



The construction of these envelopes is illustrated in Figure |4j The first 
and second equalities are direct consequence of the definition of the partition; 
the third one is by definition of the action on a set; the fourth one simply 
uses the fact that HAj U iBjj = {Au Bj. 

The last inclusion ( |5.2[ ) does not hold with equality in general. It uses 
the fact that for any set A, lA- a} C lAJ □ a. It gives us an easy way to 
compute an over-approximation of □. Moreover, in some cases the inclusion 
can be replaced by an equality, such that the over-approximation is exact 
(or tight). We give some simple cases where this equality holds in Section 



The computation of an interval containing [m, M] Ha using (5.2) reduces 
to two problems: 

(Pi) The computation of [[m, M] n Z'^'] . This can be done by integer linear 
programming (see next section). 

(P2) The computation of [[m, M] n Z'^] □ a^. This is easy when is an 
ASHE (using Algorithm |3| or when is monotone (see Equation 
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Figure 4: The construction of envelopes on a simple example. The initial 
interval [m, M] intersects the two zones Z"^ and Z^. We compute the next 



interval [m', M'\ *== [m, M\ □ a by computing ^ = M] r\ Z^ ■ a^. 
~ , M2] = [m, M n Z2 . a] , and finally [m', M'] = [[m\ M^] U [m^, Af^]] 



correspond to (5.1 ), but in this example the equality holds also in the inclu- 
sion K, M'\ = [[[m, M] n Z^j □ a U [[m, M] n Z^] □ a] . 



In Section[6| we will give methods to solve (Pi) by computing |[m, M] n Z^ 
or an over-approximation. 



5.2 Tightness 

Proposition 5.1. Let U he a subset of S and a an ASHE. If a is non- 
blocking on U (i.e. Mx £ U, B{x) = %), then {U ■ a\ = [[/]] □ a. 

Proof. The inclusion [t/ • aj C [[[/J Ha is simple, since by definition \U\ Ha = 
lim-a} and[/C[C/l. 

Let us prove now the other inclusion. Let [m, M] = [[[/]]. Since we 
assumed a to be non-blocking on [/, i.e. UxeC/ -^(^) ~ ^' have for all x in 
[/, X ■ a = (x + w) V A C. Using Lemma |4.5| we also have B{m) U B{M) = 
U^ef/ -B(x) = 0, which gives [m, M] = [(m -|- 1;) V A C, (M -|- u) V A C] 
(see Algorithm |3j in which B{m) U B{M) = implies that X = 0). 

We can now prove that [C/]] □ a C [[[/ • aj. Let [m! , M'] = [[[/]]□ a and 
Q] = [f^ ■ o^l- To prove the inclusion we just need to show that m' > q 
and M' < Q. 

We showed that m' = (m-t-v) VOAC For each component i G {1, . . . , d}, 
there exists x £ U such that Xi = rrii. Let x' = (x -|- u) V A C, we have 
x' £ U -a, thus x' > q, and m[ = rrii + ViM Q /\Ci 
Finally, ml > q, and similarly AI' < Q. 



Xi+ViVOACi = x'i> 



□ 



Corollary 5.2. Let a be a piecewise ASHE over a partition Z^, . . . , Z^ such 
that each ASHE a'^ is non-blocking on its zone Z^ . Then the inclusion (5.2) 
can he replaced by an equality. 
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Remark 5.3. It is easy to see that any ASHE a can be partitioned into 

zones such that a is non-blocking on each zone. Although this is not useful 
for computing the envelope of a simple ASHE (since we already know how 
to compute it), it can however be used to obtain tight envelopes on piecewise 
ASHEs. 

Proposition 5.4. Let U be a subset of S, a an ASHE and v its direction 
vector. If a is unidimensional, (i.e. Act = {i \ Vi 0} is a singleton), and if 

the projection of U on any component i: Ui =^ {xi \x ^U} is an interval, 
then p ■a\ = □ a. 

Proof. The inclusion [[fZ-a]] ^ [[[/]] Ha is always true, and was already 
proved. We keep the same notation as in the previous proof: [m, M] = \U\, 
[m',M'] = \U\ □ a and [q,Q\ = [fZ-aJ. Let i be the component where 
Vi 7^ 0. For all j ^ i, we have m'^ = mj, Mj = Mj, and for all x e U, 
{x ■ a)j = Xj, thus m'j = qj and Mj = Qj clearly. 

On component i, we need to consider several cases (we will show that 
m'j > Qi and < Qi): 

• If i ^ B{m) L)B(M), then the transition is non-blocking on U, and the 
previous Proposition applies. 

• If i G B{m) n B{M), then [m',M'] = [m,M] and U ■ a = U, thus 
[m',M'] = [q,Q]. 

• If i G B(m)\B(M), then rrij < —Vi < M^. In that case, we have = 
and M- = {Mi + Vi) V {—Vi — 1). By assumption, Ui = {xi \ x e U} 
is an interval, and thus Ui = [mi, Mi]. Since rui < —Vi < Mi, there 
exists X £ U such that Xi = —Vi. We then have {x ■ a)i = Xi + Vi = 0, 
and thus qi = = m'-. 

We now prove that Qi > M'- = {Mi + Vi) V {-Vi - 1). Since Ui = 
[mj,Mj], there exists y & U such that yi = Mi, thus Qi > {y ■ a)i = 
yi + Vi = Mi + Vi. Also, since rui < —Vi — 1 < Mj, there exists a z e U 
such that Zi = —Vi — 1. We then have Zi + Vi = —1 < 0, and thus 
i G B{z) and {z ■ a)i = Zi = —Vi — 1, and finally Qi > {z- a)i > —Vi — 1. 

• If z G B{M) \ B{m) a similar reasoning shows the same result. 
We finally obtain the reverse inclusion \U\ □ a C [[/" • a] . 

□ 

6 Polytopic Zones and Linear Programming 

In this section we consider a class of piecewise events for which the compu- 
tation of envelopes can be done in a reasonable amount of time and memory 
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space. We will focus here on one event a assumed to be defined piecewise 
over polytopic zones. 

Again, the state space is a product S = [0, Ci] x • • • x [0, C^], and is here 
partitioned into zones defined by linear constraints. 

Let us consider H hyperplanes of in general position and assume 
they do not intersect S (by translating them by a small e if needed). These 
hyperplanes define a partition of M*^ in K convex polytopes (Pi, 
and these polytopes induce a partition of the state space S in K zones 

= 5 n P' . Without loss of generality, we consider only the nonempty 
zones and assume we have K of them. 

Finally, over each zone Z^, the event is assumed to be easily com- 
putable (e.g. an ASHE). 

There exists a relation between the total number of hyperplanes H 
needed to define the zones and the maximal number of zones that can be 
defined using them. The number K of zones is smaller than 1 + iJ + (^) + 
• • • + (^) . This is because the state space S may intersect all zones (bounded 
or not) formed by H hyperplanes in a general position. The number of such 
zones is given by the value of the characteristic polynomial of the hyper- 
planes at point -1 (see [17]). 

Example 6.1. A queueing example is displayed in Figure^ We consider a 
Join the Shortest Waiting time (JSW) event, that is an arrival that allocates 
the new customer to the queue with the shortest expected waiting time ( after 
arrival). The zones for the JSW event are displayed in Figure^ If (x2 + 
l)//^2 > {xi + l)/(2/ii) -|- e (with e very small) then the new customer is 
sent to queue 1 as long as the queue is not full, in which case it is sent to 
queue 2. If {x2 + l)/^2 < (xi + l)/(2;Ui) -|- e, it is sent to queue 2 as long as 
the queue is not full, in which case it is sent to queue 1. If both queues are 
full, the client is rejected. Here 3 hyperplanes divide the two-dimensional 
state space into 4 zones that are polytopes: {x2 + l)/ii2 = (a^i + l)/(2/ii) +e, 
xi = Ci - I + £, X2 = C2 - I + e. 

Note that in this case, the partition makes the event a non-blocking ASHE 
in each zone, so that Equation (5.2) holds with equality (by Proposition 5.1). 
However, the two zones in the upper right corner could be merged in one 
single zone with a blocking ASHE, and the equality would still hold (see 
Proposition 5.4)- 



6.1 Zone intersection with Linear Programming 

Since we assumed the event o to be easily computable on each zone Z^ the 
only problem left to solve is (Pi) the computation of [m', M'] = M] n Z'^'] 
When the zones Z^ are induced by polytopes P^ {Z^ = 5 H P'^), this com- 
putation amounts to solving 2d integer linear programs, one for each com- 
ponent m,[ and M[ : 
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Figure 5: Example of a queueing event that can be defined with ASHEs on 
polytopic zones. The first queue has two identical servers with rate while 
the second has only one server with rate //2- 




V.E {!,..., 4, r =--{-^l-^^^^^^';^[-'^]}' (6.1) 
[Ml = max{xi \ x £ Z'^ D P'' D [m, M]}. 

Unfortunately, integer linear programs are well known to be NP-hard in 
general. These linear programs can be solved in simple cases that we will 
not detail in this paper. A general approach to overcome this difficulty is to 
relax the integrity condition and to solve the corresponding rational linear 
programs, by computing and in Z*^ defined by 



= [max{xi \ x G Q'^ n P'' n [m, M]q}\ , 



where [m,M](Q denotes the rational interval {x £ \ m < x < M}, that 
contains [m, M]. 

It is easy to see that the discrete interval [q^, Q^] contains [[m, Af] n Z^J . 
We can thus compute an over-approximation of □ for a : 
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Figure 7: Consider a zone and an interval [m, M]. The Figure dis- 
plays the intervals [m, M], [m' , M'] *== 
[q^,Q''] constructed using (|6.2[). 



[m, M] n Z'^] and the larger interval 



K 



[m, M] □ a C IJ |[m, M]nZ^ □ a" 

[k=l 
T K 

[k=l 

Figure [7] gives an illustration on a simple example. 



6.2 Complexity and data structure 

In order to deal with the set of linear programs in an efficient way, they 
can be represented as a binary acyclic graph. Each node corresponds to 
an additional linear constraint. The left child corresponds to the half plane 
where the inequality is negative and the right child to the one where the 
inequality is positive. 

When an intersection can be reached through several paths, the nodes 
are merged. 

The leaves of the graph are all the polytopes and the number of nodes 
in the tree is less than twice the number of leaves. This construction is 
illustrated in Figure [8| 

This representation has a better amortized complexity to compute [m, M]-a 
than an exhaustive resolution for each polytopic zone since the intersection 
of [m, M] with a zone can be found empty early in the tree (this can even 
happen at the root of the tree if [m, M] does not intersect one half plane. 
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Figure 8: A partition and the corresponding graph structure used to sweep 
the zones. Here, the space is partitioned into 4 polytopic zones, ^1,^2,^3 
and Z4, defined by three hyperplanes Hi,H2 and H3. The corresponding 
half-spaces are denoted and , for all i. Hence, for instance, Z'^ = 



H+ n h:^ n H+ 



Sorting the constraints (from the root of the tree to the leaves) to get 
the best possible expected cost of computing [m, M] ■ a can be done for a 
given distribution of m and M (for example uniform over 5 x 5) by using 
dynamic programming. This is not in the scope of this paper. 

The Algorithm |4] defines the procedure Bound(y, m, M, a) such that, 
when V is the graph G, Bound (G, m, M, a) = UaLiI?^' ^ which 
is an over- approximation of [m, M] Ha. It uses the procedure LP-Solve (not 
detailed) that solves linear programs over rational numbers in polynomial 
time. It also contains the test V n [m, M]q 7^ that can be done much 
faster (in linear time, see the next section) and helps to cut the cost: The 
LP-solve procedure is not used when the intersection is empty. 

By solving linear programs with the interior point method [TU] , the com- 
plexity of each main can to Algorithm [4] is 0{(f-^K). 

6.3 Fast bounds using Minkowski sums 

The previous cost to compute over-approximations of envelopes by solving 
linear programs may not be sustainable for chains with a large dimension d 
and when the number of calls to Bound is large. 

In this section we will derive a way to compute looser bounds on the 
envelopes, with a complexity 0{dHK). This technique is based on the com- 
putation of Minkowski sums of all K zones with intervals [m, M] (see |13j). 
Each polytope can be expressed as the intersection of /i^ hyperplanes: 
= {x^W^\ xA^ < h^}, where is a dxhk matrix and b a row vector of 



21 



Algorithm 4: Recursive procedure computing the bounds on the en- 
velopes using linear programming 

Data: Partition of the state space into Z^, . . . , arranged in a 
graph structure G using linear constraints; 



Result: interval fm', M'l 



Bound {V,m,M,a):= 
begin 

Visited(y):= True; 
it V n[m,M]Q / then 
if y = P*^ then 

for J = 1 to d do 

-,k . 



p'=r\[m,Mh¥'1i 



:= \ LP-Solve(min{xj \ x gV n[m, M]q})] ; 
Qj := [ LP-Solve(max{xj \ x e V H [m, 

K, M'] := [K, M'] U [q^, Q^] □ a''} ; 



else 



if Visited(V .left) = False then 
[_ Bound(F.left,m,M,a) 

if Visited(V .right) = False then 
[_ Bound(F.right,m, M, a) 



size hk. Each constraints {^1^ --^x < bj} correspond to one of the hyperplane 
defining P^. 

Testing if an interval [m, M]q of size s M — m intersects a polytope 

can be done by testing if the center of the interval (m + s/2) belongs 
to an extended zone W'^ that is the Minkowski sum of P^ and the interval 
[-s/2, s/2]: W'' P^ e [-s/2, s/2] (the Minkowski sum A®B oi two sets 
A and B is defined hy A® B = {a + b\a e A,b e B]). 

Figure [9] shows an illustration of the Minkowski sum in a simple case. 

It is direct to see that 

[m, M]q r\P^ ^$^m + s/2eW^. 
and that the set is a polytopic zone defined by hk + 2d hyperplanes: 
ly'^ = {x E I xA'' < b''+{s/2)\A''\}n{x £R'^\x> &-s/2,x < L^+s/2}, 

where \A^\ = {\Al^\) i<i<d,i<j<hk is the component-wise absolute value of 
matrix A'', and and are defined in Q'^ by 
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Figure 9: The Minkowski sum of a polytope and the interval [— s/2, s/2], 
s being the size of the interval [m, M]. 



Note that and do not depend on [m, M] and thus can be pre- 
computed with linear programming before starting the simulation. 

Testing if the center of [m, M] belongs to W'^ simply amounts to testing if 
the center satisfies all inequalities defining . This is done in time 0{dhk), 
and H being the total number of hyperplanes used to define the polytopes, 
we have < H. Therefore, the test [m, M] n 7^ in Algorithm [4] can 
be done in time 0{dH) for each polytope . 

The previous interval [q^ , Q^] computed with linear programs (solvable 
in 0{d^)) can be replaced by a looser over-approximation of ^Z^ n [m, M]| 
computable in time 0{2d). 

Indeed, since P^ C [&,L^]q, we have 

Z'=n[m,M]| C [\l^},[L''Wr\[m,M] = [my \&},M ^[L^\i 

where \^^^^ (resp. \_L^\ ) is the component-wise ceiling (resp. floor) of & 
(resp. V') and is pre-computed. The intervals [m V \£^^^,M A \_L^W can 
replace the intervals [q^ , in Algorithm |4| and are easier to compute. The 
complexity of Algorithm [4] would then be 0{dHK). However, these intervals 
provide looser bounds on the envelopes and might thus increase the coupling 
time of the bounding processes, at worst to infinity. As we pointed out 
earlier, this tradeoff between coupling time and complexity of computations 
needs to be adjusted depending on the Markov chain to simulate. 
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7 Queueing Network Examples 



In this section, we illustrate the usefulness and the expressive power of 
piecewise ASHEs by showing examples of queueing networks that can be 
described using Markov automatons with piecewise ASHEs over polytopic 
zones. 

The rule of thumb is that many classical (and some non-classical) queue- 
ing networks are such chains so that sampling their stationary distribution 
is made possible using ASHES over polytopic zones, while the classical PSA 
would be intractable in practice. 

7.1 Jackson Network with finite capacities 

We consider open queueing systems similar to Jackson networks except that 
queues have finite capacities. The network is formed by d queues numbered 
from 1 to d: Each queue i is a M/M/l/Cj queue with finite capacity Cj 
and a FIFO server. Entering customers arrive in each server i according 
to a Poisson process of parameter Aj, and each server is assumed to have 
exponential service time /Xj. After being served at queue i, a customer is di- 
rected to queue j with probability Pij or leaves the network with probability 

Pi,o *== 1 — J2j=i PiJ- ^ill consider two types of blocking policy: 

• Client Loss (CL): If a client served by queue i is sent to a full queue, 
it is lost. 

• Restart Service (RS): If a client served by queue i is sent to a full 
queue, it restarts its service at queue i. 

We assume that new clients arriving in the network from outside are lost 
when sent to a full queue. The state of the network is entirely described by 
the number of customers in each queue, and the corresponding Markov chain 
can be generated with the following Markov automaton A = (5, A, D, •) 
where S = [0, Ci] x • • • x [0,^], A = {aij\0 < i,j < d}, D{aij) = 
^aij / YlaeA where Aa the frequency rate of each event a is defined as: 

, ■■= l^iPi,j Vz, j G [1, . . . , d] 

and where the aij are ASHEs with direction vector Va^^ = ej — for all 
i,j G {0, . . . , d} (with the convention cq = (0, . . . , 0)), and with a blocking 
relation T^-o, ^ depending on the blocking policy used: T^af". = {{hj)} for the 
Client Loss policy and 'R-afj = {(hj), {j,^)} for the Service Restart policy. 
Note that when i = or j = 0, the choice of TZ has no incidence. The 
blocking policy may also differ for each couple 



24 



It is easy to check that the events of this Markov automaton are all 
monotone for the product order on 5 = [0, Ci] x [0,(7^]. Therefore for this 
type of networks, the coupling times for PSA and EPSA are equal. 



7.2 Examples of non-monotone ASHEs 

In this section, we consider other types of routing events that are not mono- 
tone nor anti-monotone and can be described with ASHEs. These events 
can easily be added to any Jackson networks with finite capacity, and they 
can easily be combined with each other to give other types of ASHEs. 

Envelopes for these events were already computed in pj using ad-hoc 
methods, while the ASHE framework now embraces them all. The numerical 
results for these events are taken from [2j. 



Fork and Join A common type of non-monotone event used in queueing 
network is fork and join. A fork queue is such that, at service completion, 
several customers are sent simultaneously to several disjoint output queues. 
A join is a queue with several input buffers. The server is only active when 
no buffer is empty. A service completion removes one customer in each input 
buffer and sends one (merged) customer to an output buffer. 

Fork events were already introduced in Example |4.3| and shown to be 
ASHEs. A join event with two input buffers i and j and output buffer k is 
described by an ASHE with direction vector v = — Ci — ej and blocking 
relation depending on the blocking policy: TZ'-^^ = {{i, k), {j, k), (j, i)} 
or 7^«^ = {{i, k), {j, k), (ij), {j, i), {k, i), {k,j)}. 

Join events are not monotone for the natural product order: Consider for 
instance the join event a with i = 1, j = 2 and /c = 3 in a 3-server network, 
and consider the two states x = (0, 1, 0) and y = (1, 1, 0). Then x < y while 
X • a = (0, 1, 0) and y ■ a = (0, 0, 1) are incomparable. Similarly, it is easy to 
show that fork is in general not monotone. 



Negative customers We consider the additional events corresponding to 
negative customers. After queue i finishes a service, a negative customer is 
sent to queue j and kills one customer in that queue, if there is any. This 
event can be represented by an ASHE of direction vector v = —Ci — Cj and 
blocking relation IZ = {(i, j')}. Such events are not monotone for the natural 
product order. 

We ran simulations for the small network displayed in Figure [TT} The 
average coupling time of PSA and EPSA are reported in Figure 10 (in the 
following all data are given with 95 % confidence intervals). 

In the example, the coupling time of EPSA is around 3 times larger 
than PSA. Since the number of states is 15^ = 225, this means that EPSA 
is about 225/6 = 40 times faster than PSA in this case. 
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Figure 10: Mean coupling times of PSA and EPSA algorithms for the net- 



work in Figure 11 as a function of A2 with 95 % confidence intervals. 
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Figure 11: A network with negative customers. 



Batch routing Here, we consider batch routing: After the end of ser- 
vice in queue i, a batch of K customers leaves queue i and creates L new 
customers in queue j. In general, K and L do not need to be equal: for 
instance 10 products that are assembled into one {e.g. put into a box), 
and then sent to next phase of a production line can be modeled by taking 
K = 10 and L = 1. This event can be represented by an ASHE of direction 
vector V = —Kei + Lcj and blocking relation depending on the blocking pol- 
icy: -RP^ = or 7^^■5 = As before, the 
exogenous arrivals (resp. services) can be modeled by setting i = (resp. 
j = 0). We allow partial arrivals to queue j (in the case of IZ^^), while 
the service must be complete (i.e. at least K customers must be present in 
queue i). However, partial service can also be modeled by piecewise ASHEs, 



using similar ideas as for multiple servers in Section 7.3 



Figure 12 reports the coupling time of EPSA for a single queue with 
batch arrivals of size K = 1,2,3 or 5 with rate A, departures of size 1 
and rate 1, and blocking policy TZ = {(1,1)}. Unlike what happens for 
negative customers and fork-join nodes, the coupling time of EPSA increases 
exponentially fast with the load of the queue (the scale being logarithmic). 
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Such increase is due to the fact that envelopes can only couple after they 
both hit 0. The time for the upper envelope starting in C to hit when the 
load p is larger than 1 becomes exponentially large (larger than e''^). 

One way to prevent this is to use the Split algorithm [21 Algorithm 3], 
that is an hybrid between PSA and EPSA: It starts by computing bounding 
intervals as EPSA until the size of the interval becomes smaller than the size 
of the largest batch. When this happens, the bounding interval is then split 
into all the single trajectories contained in it, and these normal trajectories 
are then computed as in PSA. The advantage of the Split algorithm is that 
these normal trajectories as not as numerous as the |5| trajectories needed 
for the classical PSA, and it is guaranteed to terminate when PSA does 
unlike EPSA. 

Figure 13 reports the coupling times of EPSA, PSA and Split for a queue 
with capacity 20, arrival batches +2 and +3 with probabilities 0.49 and 0.51, 
and departures of size one (with rate 1). In this figure, the coupling time 
of PSA and Split are similar while EPSA's coupling time grows very fast 
when \/ p. is larger than one. In this case Split has a clear advantage over 
the other two algorithms: It couples exponentially faster than EPSA and 
deals with two envelopes instead of C trajectories for PSA. The efficiency of 
Split can be partially explained by the choice of the splitting point: as long 
as the size of the bounding interval is larger than K (the size of a batch), 
the batch arrival event is nonexpansive. This is a direct consequence of 
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Figure 12: Mean coupling times for EPSA in a —1) queue with batch 
arrivals of size k = 1,2,3 and 5 respectively, as a function of the arrival 
rate A. 
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Figure 13: Mean coupling times for PSA, EPSA and Split for a (+2, +3, — 1) 
queue. 



7.3 Examples of Piecewise ASHEs 

We now consider even more complex queueing events that will prove to be 
piecewise ASHEs over polytopic zones. 

Multiple Servers Let us assume now that queue z is a M/M/C/C queue 
with finite capacity C and C identical servers with serving rate fj,, and that 
customers are sent to queue j after being served. Such type of service can be 
described using C piecewise ASHEs ai, . . . ,ac as illustrated in Figure [141 

The state space of queue i is [0, C] and the transition rate matrix Q 
associated to this event is such that Q{x,x — 1) = x/i, the other coordinates 
being null. We add C events ai, . . . ,ac with each defined as the following 
piecewise ASHE: on zone [0, A; — 1], is the identity {v = 0, TZ = 9) and 
on [A:,C], is the ASHE with direction vector {v = Cj — ei) and blocking 
relation IZ = Finally, we set A^;, = /i, and take D{ak) = ^a^/J^aeA -^a- 

This construction can be easily adapted to any queue with n < C servers, 
which can be represented with n piecewise ASHEs. It can also be easily 
combined with any other type of events discussed previously. 



Join the Shortest Waiting Time (JSW) and other routing policies 

JSW is described in Example 6.1 and is shown to be a piecewise ASHE over 
polytopic zones. 

More generally, consider the routing event rj where a client, after being 
served at queue i, is sent to queue j = R{x) for some function i? : 5 — )• [0,d] 
(if j = the client leaves the system), and the client is either lost (CL) 
or restarts its service (RS) if the destination queue is full. Such event can 
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(a) Transition rate matrix 




(b) Corresponding Markov automaton 



Figure 14: Representation of tiic transition rate matrix (a) of the multiscrver 
M/M/C/C queue, and its corresponding description as a Markov automaton 
(b) using C piecewise ASHEs ai, . . . , ac, each being drawn with rate /x. 

be described as a piecewise ASHE over at most d zones = R~^{k) for 
k = 1, . . . ,d with the event on being an ASHE with direction vector 
Ck — Cj and blocking relation TZ'^'^{{i, k)} or TZ^^ = {(i, k), {k, i)} depending 
on the blocking policy. Provided the zones are polytopic, it is then easy 
to compute over-approximation of envelopes for this event. For some non- 
polytopic zones Z, the interval [[[m,M] Ci Zj can still be easily computed 
with had-oc methods. For instance, if we consider the routing by index 
R{x) = argminjg|o,...,(i}/i(^«) fo^' some strictly increasing real functions fi 
(we assume the fi functions are chosen such that the arg min is unique for 
any x G S), then the corresponding zone is of the form = R~^{k) = 



{x I Vj G {!,..., d}, fk{xk) ^ fjixj)}^ one can show that [m',M'] = 
[[m,M] nZ^] is given by m'^ = nik, M'^ = f^\mmi^^o,...,d} M^i))' and 
Vi ^k,m'i = f,~\fkimk)) V rm and = Mj. 



Of course, routing policies can be combined with all the previous types 
of event (forks and joins, negative customers, batches, multiple servers) to 
give more types of piecewise ASHEs. 

7.4 Application to a network 

Here is an example of a network where events are ASHEs or piecewise ASHEs 
over polytopic zones. The following model permits us to compare the routing 
policies JSW to the Random (Rand) routing. 

In this model, we consider two parallel queues 1 , 2 with same capacity 
Ci = C2 = 20 and one server with service rate /ii,/Lt2, respectively. We 
introduce // = /Ui + //2 and a = Hi/ ji such that /xi = O/U and = (1 — a)/x. 
Clients arrive in the system at a rate A = 1 and go in one of the two queues, 
according to the chosen routing policies. 

To compare the two policies, the 2-queue system is duplicated to give 
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a 4- queue-network, with two 2-queue subsystems with coupled arrivals and 



departures (see Figure 15). Incoming clients are duplicated via a fork and 
sent to each subsystem. If one of the subsystem rejects its copy of the client, 
the other must reject it as well. This is to ensure that both systems always 
receive the same quantity of clients. 




- Ci 


= 20 






= 20 




c\ 


= 20 




C2 


= 20 



Joint Departures 




Joint Departures 



Figure 15: Network to compare the JSW and the Random routing policies 



The first system then routes its clients using the JSW policy, while 
the second system sends its clients randomly to queue 1 with probability 
^j-^^l^^^ and to queue 2 with remaining probability. If the packet arrives 
in a full queue, it is rejected (and the client is rejected by the other sub- 
system as well). The departures are also coupled: with rate /ii [resp. 
a packet leaves queue 1 [resp. 2] (if the queue is not already empty), simul- 
taneously for each subsystem. When the queue is empty in one subsystem, 
the corresponding queue in the other subsystem can still serve its clients. 



Figure 16 shows the results of the simulation of that network using the 
envelope technique. We plotted the average difference of clients in the queue 
between the two subsystems, when the ratio a varies between 0.1 and 0.9, 
and for several values of the network load p = A//i. 

As we can see, the JSW routing strategy performs better than the ran- 
dom routing in all cases, and particularly when the serving rates are not 
close from each other or when the workload is high. 

The coupling time is much lower for low load. In that case the dominant 
events are non-blocking services. If the load is high, the dominant events 
are arrivals that are blocking. Further, the coupling can only be achieved 
by hitting the lower boundary of the state space (by emptying the system), 
so the coupling time grows exponentially with the load. 



8 Conclusion 



Perfect sampling can be a very efficient method to study the stationary 
behavior of a Markov chain. This efficiency relies on three conditions. The 
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0.2 0.4 0.6 0.8 1 ^ 

a 

Figure 16: On the left: load difference between the two subsystems (Differ- 
ence between the average number of clients in the random routing servers 
and the JSW servers) when a varies and for several value of the network 
load p. On the right: corresponding coupling time (number of iteration). 
Note that the scale is logarithmic in the number of iterations. For each point 
of the figure, 10000 simulations were run. 

first one concerns the small number of simulated trajectories needed for 
unbiased coupling. The second condition is the requirement that a small 
time should be needed to compute those trajectories and the third condition 
is to have a small coupling time. When these three conditions are satisfied, 
the perfect sampling technique is hard to beat to compute the stationary 
distributed with a given confidence interval. 

In this paper we show that the first condition is easy to meet by using 
envelopes. We also show that the second condition can be met for piecewise 
space homogeneous chains by using a special type of events (ASHEs) and 
by over approximating the envelopes. The third condition is more chain 
dependent. We provide several examples where coupling using the envelope 
technique happens fast and others where the coupling time is very large. 
In those cases, the splitting technique can help reduce the coupling time 
without compromising with the other two points. 
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A Proof of Theorem 4.4 



Denote [m',M'] = [m, M] □ a. Consider a fixed component j. We have the 
following different cases: 

1. j ^x<^[m,M]B{x) (component j is not blocked in any state x, m < 
X < M). By Lemma 4.5, this is equivalent to j B{m)L)B{M). Then: 



[m'j,Mj] = [{mj + Vj) V A Cj, {Mj + vj) V A Cj] 



2- j £ f^x&[m,M]B{x) (component j is blocked in all states x, m < x < 
M). It is easy to see that this occurs if and only if there is z G 
CR{m)r\CR{M) such that (i, j) G 7^. Then {x-a)j = Xj, m < x < M 
and: 

[m;.,Mj] = [mj, Mj]. 

3. j G {U.^(z[m,M]^(^))\('^xe[m,M]^(^)) (component j is blocked in some 
but not all states x, m < x < M). Then we have the following cases: 

(a) There is i / j such that {i,j) G 71 and « G CR{m) U CR{M) 
(component j is blocked by a component i different than j). 
If Vj < 0, then 7^ or j CR{M) (remark that G TZ 
and j G CR{M) would imply j G CR{m) n CR{M) and thus 
j e n^g[^^A,/]-B(x)). We have: 

[m;.,Mj] = [(mj + i;,-) V 0, M,-] . 

If Vj > 0, then similarly (j, j) ^ TZ or j ^ CR{m) and: 

[m'j,M'j\ = [mj, {Mj + Vj) ^ Cj] . 
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(b) Otherwise (only j is blocking j). Then G TZ and j G 

CR{m)U CR{M). 

If Vj < 0, then j £ CR{m)\CR{M) {i.e. rrij < -vj < Mj) and: 
[m;.,Mj] = [0, {M,+v,)V{-vj-l)]. 

Note that —Vj — 1 = max{xj | x £ [m, M], j G B{x)}. 

Uvj > 0, then j G CR{M)\CR{m) {i.e. Cj-Mj < vj < Cj-mj) 

and: 

[m'j,M!j] = [{rrij + vj) A {Cj - vj + 1), Cj] . 

For a given component j, the complexity of computing [m'j, Mj] amounts 
to determining the corresponding case. However, this can be precomputed 
using 0(|7?.|) = 0{cP) elementary operations. Thus the complexity of Algo- 
rithm [s] is 0{(f). 



34 



